ABSTRACT



To combat difficulties of dictated spelling tests, 
such as unreliable scoring due to illegible writing and the 
possibility of clues being provided through the enunciation of words 
by examiners, a technique was developed for writing experimental 
spelling tests by computer. Additionally, the diagnostic function of 
spelling scales was considered through the use of specific error 
categories in test construction. Five 50-item forms (each item 
containing from zero to four misspellings and representing a 
particular well-defined error category) were constructed and 
administered to 335 high school seniors, along with a standard 
battery of verbal tests. Although the machine-scorable tests raise 
their own difficulties — they may depend more on the student's 
proofreading ability than on his spelling proficiency and may, in the 
course of testing, teach the student misspelled words and so impair 
his spelling ability — the experimental tests were found to function 
as well as the standard spelling test. The 12 separate error 
categories did not function as independently as was hoped; the need 
for further experimentation with four "super" categories (addition,
omission, inversion, and substitution) was indicated. (Twenty sample 
test items, tables of test results, and a list of references are 
included.) (Author/MF) 
Agreement of two early studies (Wager, 1912; Tidyman & Johnson, 1924)
indicates that grouping words by similarity and structure is of assistance
in learning to spell such words. Foran's (1934) survey of the grouping
problem supported the belief that grouping should be practiced except in
specific, disadvantageous cases, such as homonyms. Furthermore, if spelling
mistakes can be classified into certain types, there are grounds for
teaching words in groups in order to guard against mistakes made persistently
(Foran, 1934). Numerous studies of spelling errors have shown that mistakes
should be differentiated into categories of errors, which most writers define
as classes in which mistakes may be grouped (Hollingworth, 1919; Book &
Harter, 1926; Masters, 1927; Alper, 1942). Alper has further differentiated
error categories into objective and subjective categories. She considers
objective categories, such as omission of letters, failure to double a
consonant, etc., easy to set up but limited in value, as opposed to subjective
categories. Such subjective categories, including tension at a hard spot of
a word, as in the case of reversals or anticipated letter insertions, and
negative transfer, as in the case of incorrect carry-over of the spelling of
the root or base of a word, are difficult to set up because operational
definitions are difficult to formulate (Alper, 1942). Most of the error
analysis studies attempt to assert that errors represent not so much an
inability to spell some words as an indifferent attitude toward mistakes
themselves (Foran, 1934).

High correlations between various spelling measures that do not differ
in factorial composition suggest that spelling ability is not influenced by
the method of testing (Ager & Allen, 1965). However, most results in
experimental reports over many years show that although the scores on
recognition and recall spelling tests often yield intercorrelations close to
.90, the recognition scores are higher than the recall scores, and the scores
on particular words vary widely between the two tests according to the
difficulty of the words and maturity of the students (Lindquist & Cook, 1933;
Weller & Broom, 1934; Nelson & Denny, 1936; Moore, 1937; Sturdyvin, 1937;
Jackson, 1943; Brody, 1944). Nevertheless, scoring tedium and expense of
recall tests represented by the traditional dictation forms have led to wide
use of machine-scorable tests which depend on student recognition of correctly
spelled or misspelled words. Specific faults of the dictated tests include:
(1) unreliable scoring due to illegible writing, (2) presence of clues to
correct spelling in the enunciation of problem words by the examiner,
(3) difficulty of transferring item data from the test by hand to
machine-scorable answer sheets or computer cards before statistical analysis
of test outcomes can be undertaken, and (4) known deterioration of the
spelling ability of the scorer himself over long periods of scanning
misspelled words. However, there are also objections to machine-scorable
tests: (1) the method lacks natural relevance, since the student's performance
is not an act of spelling and must depend to some extent on visual acuity or
proofreading ability, rather than spelling proficiency; (2) the student may
learn, in the course of testing, misspelled words and so impair his spelling
ability; (3) above all, the test does not yield the same difficulty
coefficients for the same words when they are presented in misspelled or
correctly spelled forms, i.e., students are more likely to accept the
misspelling of a word than to reject its correct spelling (Wesman, 1946;
Thomas, 1968). In a comparison of two tests in which half of the words
correctly spelled in one test were incorrectly spelled in the other test, and
the other half, which were incorrectly spelled in the former test, were
correctly spelled in the latter, Wesman (1946) found a difference in validity
of correct and incorrect forms. A comparison of the coefficients of
correlation for each item in its correct and incorrect forms against the total
score yielded higher coefficients with the incorrect forms. In light of
indications that only the incorrectly spelled words on a machine-scorable
recognition test contribute to the testee's score, Wesman suggested that a
fully efficient spelling test would be of the true-false type in which the
testee must consider both correct and incorrect words. Wesman, however, did
find that a certain number of words were useful when they were correctly
spelled (Wesman, 1946).

Not only can tests be scored rapidly and reliably by machine, but a recent
study also demonstrated the usefulness of a computer in writing items (Fremer &
Anastasio, 1968). Spelling items were used because they seemed to have the
fewest facets or dimensions. This study developed and programmed rules for
item construction which were then applied to selected words. The item
construction program was used with a set of 40 rules for creating misspelled
words. In their evaluation of the computer output, experienced spelling test
item writers concluded that the computer-generated lists of words with their
misspellings would be a helpful resource even with a large proportion of
implausible misspellings. Woodsen (1968), who recognized the problem of
selecting and ordering items in test construction, also provided a
computer-programmed solution. The program input included the set of n items
and their answers; the output included any number of items randomly selected
from the set of n items and randomly ordered.
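
A minimal Python sketch of the kind of selection-and-ordering routine
described for Woodsen's program follows. The function name `draw_items`, the
`seed` argument, and the sample pool are illustrative assumptions, not details
of the 1968 program.

```python
import random

def draw_items(item_pool, k, seed=None):
    """Return k items drawn without replacement, in random order.

    item_pool is a list of (item, answer) pairs; seed is an optional,
    hypothetical convenience for reproducible draws.
    """
    if k > len(item_pool):
        raise ValueError("cannot draw more items than the pool holds")
    rng = random.Random(seed)
    # random.sample selects without replacement and returns the
    # selection in random order, so no separate shuffle is needed.
    return rng.sample(item_pool, k)

# Hypothetical pool: (word as shown, True if correctly spelled)
pool = [("receive", True), ("recieve", False),
        ("separate", True), ("seperate", False)]
picked = draw_items(pool, 3, seed=1)
```

Drawing without replacement is a design choice of this sketch; the source
does not say whether the original program allowed an item to be drawn twice.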



The present study was undertaken to respond to several disadvantages of
current spelling tests by developing a machine-scorable test in which randomly
ordered items can be answered in a true-false or forced-choice form. Items
for such a test should be based on randomly selected words representing a
wider range of critical words than contribute to present machine-scorable
tests. Additional goals of the study were to consider the diagnostic function
of spelling scales through the use of specific error categories in test
construction and to investigate whether grouping of words similarly misspelled
into items representing distinct error categories would facilitate
discrimination of the correctness or incorrectness of a word's spelling.

Method 

The words which constituted the item pool were drawn from several sources
of frequently misspelled words. A list of 606 "most frequently misspelled
words in the English language" (Furness, 1964) was used as a starting place.
From this same source the entire "Remington List of Words Most Frequently
Misspelled by Adults" was also drawn. Other words were selected from a list
of words most frequently taught in U.S. classrooms (Gates, 1937). Words from
these lists rated below 6.0 (sixth grade) difficulty were not employed.
Additional words were then drawn from a list of 30,000 words for which
numerical ratings of frequency of occurrence in general and in four different
sets of reading material were known (Thorndike & Lorge, 1944). In this
selection words were included which occurred at least once per one million
words but less often than 15 times per million words. Many of the words
included in this last selection were words whose roots were among the shorter
words which had been earlier excluded on the basis of their Gates gradings
(below 6.0).

On separate 3x5 index cards each word along with one or more common 
misspellings, difficult letters underscored, number of letters in the correctly 
spelled form of the word, and ratings from various sources were recorded. After 
words from the first two sources were recorded, cards containing words rated 
below 6.0 or having more than 15 or fewer than 4 letters were separated as 
reference for selecting words from the sub-set of the 30,000 word list. After 
words were selected from the latter source, a second set of words was deleted 
from the item pool. These words included proper nouns, separated or hyphenated 
words (e.g. all right, go-between, etc.) and those words whose misspelled forms 
were actually correctly spelled forms of different words (i.e. homonyms). 
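
The screening rules described above can be sketched as a single predicate.
This is only an illustration: the function name `eligible` and its parameters
are assumptions, lowercase spelling is used as a stand-in for "not a proper
noun", and the homonym rule is left out because it needs the misspelled forms
and a reference lexicon.

```python
def eligible(word, gates_grade=None, per_million=None):
    """Apply the pool-screening rules to one candidate word (sketch).

    gates_grade: Gates difficulty rating, if the word has one.
    per_million: Thorndike-Lorge frequency, if the word came from that list.
    """
    # Separated or hyphenated words fail isalpha(); capitalised entries
    # are treated here as proper nouns (an assumption of this sketch).
    if not word.isalpha() or not word.islower():
        return False
    # Words with fewer than 4 or more than 15 letters were set aside.
    if not 4 <= len(word) <= 15:
        return False
    # Words rated below sixth-grade difficulty were not employed.
    if gates_grade is not None and gates_grade < 6.0:
        return False
    # Frequency band: at least once, but fewer than 15 times, per million.
    if per_million is not None and not 1 <= per_million < 15:
        return False
    return True
```

For example, `eligible("go-between")` and `eligible("necessary",
per_million=40)` are both rejected, while `eligible("mediocre",
per_million=3)` passes; the frequency values here are made up for
illustration.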

In order to categorize the errors in each misspelled form of a word, a
list of 20 different error categories was compiled from word structuring
categories (Furness, 1964) and from Masters' Classification of Spelling Errors
(Foran, 1934).

Clerks in the Bureau of Testing were asked to assign each word to one of 
the 20 error categories. They specified the category by number and indicated 
the letter(s) involved in generating the misspelled form of the word. Because 
clerks encountered difficulty with the structuring categories, these categories 
were eliminated. Accordingly, the error categories actually used were: 

(1) Double vowel for a single vowel 

(2) Double consonant for a single consonant 

(3) Single vowel for a double vowel 

(4) Single consonant for a double consonant 

(5) Omission of vowels 

(6) Omission of consonants 

(7) Addition of vowels 

(8) Addition of consonants

(9) Inversion of letters (e.g. spelling "ei" instead of "ie")

(10) Phonic substitution of a vowel (or vowel pair) for another

(11) Phonic substitution of one consonant (or consonant pair) for

(12) Phonic substitution of a vowel-consonant pair for another vowel-
consonant pair similar in sound (including also vowel-pair
substitutions for vowel-consonant pairs)

These words with their descriptive error classifications were then punched
on cards. Each card included one correctly spelled word, one misspelled form
of that word, and the appropriate error classification. This classification
included both the error category number and a letter code which identified the
letters in the correctly spelled form which were involved or distorted in the
misspelled form. For the addition or insertion of letters category the letter
code identified the additional letter. In order to determine the actual sets
of difficult letters involved in each error category, cards were then sorted
by their number-letter classification. Within each numerical error category,
all words were sorted by letter classification into alphabetical order and
grouped into the letter sub-groups within each numbered category.
Sub-categories with fewer than six entries were dropped at this stage.
Category 1, doubling vowels, was dropped because of the few examples
available. All sub-categories were required to contain a minimum number of
words so that a variety of items could be constructed from each sub-category.
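
The card-sorting step can be sketched in Python as a grouping by
(category number, letter code) followed by a size filter. The tuple layout of
a card, the function name, and the example cards are assumptions made for
illustration; only the six-entry minimum comes from the text.

```python
from collections import defaultdict

MIN_ENTRIES = 6  # sub-categories with fewer than six entries were dropped

def build_subcategories(cards, min_entries=MIN_ENTRIES):
    """Group cards by (category number, letter code) and drop small groups.

    Each card is assumed to be a tuple:
    (correct spelling, misspelled form, category number, letter code).
    """
    groups = defaultdict(list)
    for correct, misspelled, category, letters in cards:
        groups[(category, letters)].append((correct, misspelled))
    # Keys are sorted by category number, then alphabetically by letter code.
    return {key: sorted(words)
            for key, words in sorted(groups.items())
            if len(words) >= min_entries}

# Hypothetical cards: six "single m for double m" entries, two inversions.
cards = [
    ("committee", "comittee", 4, "m"), ("recommend", "recomend", 4, "m"),
    ("immediate", "imediate", 4, "m"), ("accommodate", "accomodate", 4, "m"),
    ("command", "comand", 4, "m"), ("summary", "sumary", 4, "m"),
    ("receive", "recieve", 9, "ei"), ("seize", "sieze", 9, "ei"),
]
subcats = build_subcategories(cards)
```

In this toy run only the (4, "m") sub-category survives, because the
inversion group has just two entries.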

A computer program was then designed to construct tests from the remaining
set of words and their misspelled forms according to the following
specifications:

(1) Fifty-five items were to be selected for each test

(2) Each item was to have four words randomly drawn from the same
error sub-category

(3) To construct a single item the program had to randomly select:

(a) an error sub-category

(b) four words from that error sub-category

(c) either a correct or incorrect spelling of each of the four words.

Ten sets of fifty-five items were to be constructed in this way. If any
of the first fifty items of a given set included word repetitions (random
selection did not preclude selecting the same word more than once) they were
to be dropped and one of the last five items substituted to produce a final
form of each test containing fifty items. Tests, test directions and answer
sheets were prepared. For these tests subjects were to be asked to indicate
whether each of the four words making up each item was correct or incorrect.
Figure 1 shows representative items drawn from Form D of these tests.
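
The construction rules can be sketched as follows. This is a hedged
reconstruction, not the original program: the names `build_item` and
`build_form`, the dictionary layout, the equal chance of showing a correct or
incorrect spelling, and the behaviour when the five spare items run out are
all assumptions.

```python
import random

def build_item(subcats, rng):
    """Draw one four-word item from a single, randomly chosen sub-category."""
    key = rng.choice(sorted(subcats))        # (a) pick an error sub-category
    pairs = rng.sample(subcats[key], 4)      # (b) pick four words from it
    words = []
    for correct, misspelled in pairs:        # (c) show either spelling
        show_correct = rng.random() < 0.5    # assumed 50/50 split
        shown = correct if show_correct else misspelled
        words.append((shown, show_correct, correct))
    return {"subcat": key, "words": words}

def base_words(item):
    """Correctly spelled forms underlying the four words of an item."""
    return {base for _, _, base in item["words"]}

def build_form(subcats, rng, n_draw=55, n_keep=50):
    """Draw n_draw items, then replace repeats among the first n_keep
    with items taken from the remaining spares."""
    drawn = [build_item(subcats, rng) for _ in range(n_draw)]
    spares = drawn[n_keep:]
    form, seen = [], set()
    for item in drawn[:n_keep]:
        # Swap in spares until the item repeats no earlier word.
        while item is not None and base_words(item) & seen:
            item = spares.pop(0) if spares else None
        if item is not None:
            form.append(item)
            seen |= base_words(item)
    return form  # may hold fewer than n_keep items if the spares run out

# Hypothetical pool: nine sub-categories of forty placeholder word pairs.
demo_pool = {(c, l): [(f"w{c}{l}{i}", f"x{c}{l}{i}") for i in range(40)]
             for c in (2, 4, 9) for l in "abc"}
demo_form = build_form(demo_pool, random.Random(0))
```

Each sub-category needs at least four entries for `rng.sample` to succeed,
which the six-entry minimum described earlier guarantees.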

In order to determine the length of time necessary to complete such a
test, a preliminary session examined 50 college freshman subjects, randomly
drawn from Introductory Psychology classes. All Ss received one form of
the ten tests in this session. On the answer sheets provided for the Ss,
a large blank square appeared below each set of ten item answer blanks. As
Ss completed each ten items, they looked at the clock and recorded the time
in the blank square. Because E recorded the starting time, Ss only recorded
the clock time after the tenth, twentieth, thirtieth, fortieth and fiftieth
items. This testing session demonstrated that a fifteen minute time period
would be sufficient for the average subject to complete the test.

During Fall quarter 1968, high school students taking the Washington
Pre-College battery on the UW campus were tested using five forms of this
experimental spelling test. Each of these 335 high school seniors received
one form of the five as the last test of their morning battery of tests.
The experimental form appeared on a separate mimeographed form similar to the
Washington Pre-College Spelling section. Ss were allowed 15 minutes to
complete the experimental test.

Figure 1

Test Items, Experimental Spelling Test, Form D

intimate, harmed, cucumnber, obstinate 
suscriber, doubtful, dout, adominal 
underwear, boulaverd, congragetion, elagence 
haggard, strugle, sugestion, exagerate 
bugget, consider, divident, graduation 
deceive, recieving, sieze, bull it en 
almanac, particulary, amighty, regularly 
pronounciation, ancestral, coustom, abuse 
ambalunce, gaurantee, gradually, circalur 
faith, attainable, entertane, certenly 
arrears, spere, weary, harbor 
pospone, delightful, bankruptcy, fiction 
inform, income, confirm, informed 
emmanate, ommit, demmand, humility 
wiccid, gnit, remarkable, racket 
agreement, feble, exceedingly, indeed 
licquor, arctical, awkward, liquid 
earge, sea rum, cafetearia, anaesthetic 
principle, absorption, subscribtion, adobt 
secretary, massacre, acre, mediocer 



These tests were then scored two ways, by words and by items. Word scores
represented total number of words correctly marked less the number of words
incorrectly marked. Item scores represented the total number of items correct.
To have an item correct all four words making up that item had to be correctly
marked.
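
The two scoring rules can be written out as a short Python sketch. The
function name, the boolean encoding (True meaning "marked as correctly
spelled"), and the use of None for an unmarked word are assumptions made for
illustration.

```python
def score_form(responses, key):
    """Return (word score, item score) for one answer sheet.

    responses and key are lists of items; each item is a list of four
    values, True for "correctly spelled" and False for "misspelled".
    A response of None stands for a word the subject left unmarked.
    """
    right = wrong = items_right = 0
    for resp_item, key_item in zip(responses, key):
        marks = [None if r is None else (r == k)
                 for r, k in zip(resp_item, key_item)]
        right += marks.count(True)
        wrong += marks.count(False)
        # An item counts only if all four of its words were marked correctly.
        items_right += all(m is True for m in marks)
    return right - wrong, items_right

key = [[True, False, True, True], [False, False, True, True]]
responses = [[True, False, True, True], [False, True, None, True]]
word_score, item_score = score_form(responses, key)   # (5, 1)
```

Here six words are marked correctly and one incorrectly, giving a word score
of 5, and only the first item is fully correct.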

In order to score these tests according to error categories the ten
categories had to be regrouped into a smaller number of error "super"
categories. These error "super" categories and the regrouped error categories
which they represented were:

  I. Addition: Insertion of a letter within the correct spelling of a
     word

     Cat. 2:  Double consonant instead of a single consonant
              (Insertion of a consonant identical to an adjacent
              consonant essential to the correct spelling of the
              word)

     Cat. 7:  Non-doubling vowel additions (Insertion of a vowel
              which may or may not be identical to vowel(s)
              essential to the correct spelling)

     Cat. 8:  Non-doubling consonant additions (Insertion of a
              consonant which may or may not be identical to
              consonant(s) essential to the correct spelling)

 II. Omission: Omission of a letter essential to the correct
     spelling of a word

     Cat. 3:  Single vowel for double vowel (Omission of a vowel
              from a pair of identical vowels)

     Cat. 4:  Single consonant for a double consonant (Omission of
              a consonant from a pair of identical consonants)

     Cat. 5:  Non-pair-splitting vowel omissions (Omission of a
              vowel or vowel pair, which function as a single
              vowel, from the correct spelling)

     Cat. 6:  Non-pair-splitting consonant omissions (Omission of
              a consonant or consonant pair, which function as
              a single consonant, from the correct spelling)

III. Inversions: Reversals of the order of two letters (including
     vowel pair or vowel/consonant combinations) which
     may or may not involve an arbitrary third letter;
     Cat. 9 defines this super category

 IV. Substitutions: Replacement of one letter (or letter pair)
     essential to the correct spelling of the word
     with another letter (or letter pair)

     Cat. 10: Vowel substitutions (replacement of one vowel
              or pair with another vowel or pair)

     Cat. 11: Consonant substitutions (replacement of one
              consonant or pair with another consonant or pair)

     Cat. 12: Vowel-consonant substitutions (replacement of a
              vowel-consonant pair with another vowel-consonant
              pair)

All further analyses and discussion depend on these "super" categories and the
ten original error categories are not considered further. Each test was then
scored by items and words for each of the categories. Scores for total items
and words correct and for items and words correct for each category were
compared within each experimental form. Total scores and category scores on
each form were also compared with scores on four of the verbal tests included
in the Washington Pre-College test battery.
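
The regrouping amounts to a lookup from original category number to super
category, sketched below in Python. The dictionary and function names are
illustrative, and the sketch assumes that the inversion super category
corresponds to Category 9 of the earlier list.

```python
from collections import Counter

# Original error category number -> "super" category (sketch).
SUPER_CATEGORY = {
    2: "I", 7: "I", 8: "I",            # additions
    3: "II", 4: "II", 5: "II", 6: "II",  # omissions
    9: "III",                          # inversions (assumed to be Cat. 9)
    10: "IV", 11: "IV", 12: "IV",      # substitutions
}

def items_per_super_category(item_categories):
    """Count how many items of a form fall in each super category."""
    return Counter(SUPER_CATEGORY[c] for c in item_categories)

# Hypothetical form whose six items carry these original category numbers.
counts = items_per_super_category([2, 7, 9, 10, 10, 4])
```

Because forms were assembled at random, the number of items per super
category differs from form to form, which is why per-category maxima have to
be tracked separately for each form.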



Results 

Mean scores for correct words and items, scored totally and for each
category, are shown in Table 1. For the total scores the maximum possible
scores are the same for all forms: 200 words or 50 items. For categories,
however, these maxima vary from form to form, and the number of items
belonging to each category in each form has been indicated by parenthesized
numbers in Table 1. The corresponding number of words is, of course, four
times that number. In Table 1 means are followed by SDs.

To permit more direct comparison of these mean results, Table 2 reports
the mean proportions of words and items answered correctly on each of the
five forms. Because all of the words making up an item had to be correct
before an item was regarded as answered correctly, the proportion of items
answered correctly is always smaller than the proportion of words answered
correctly. Although there was form to form variability, Category IV items
(or words), involving letter substitutions, appeared to be the easiest and
Category III items, reversals of letters in a letter pair, the hardest. For
the two forms on which Category IV items were not the easiest (Forms 1 and 4)
this place was held by Category II, omissions of a letter, and for the two
forms, 3 and 5, on which Category III was not the hardest this position was
occupied by Category I, letter insertions.

TABLE 1

Mean Scores and Standard Deviations for Experimental Spelling Tests
([ill.] = entry illegible in the source copy)

                     Form 1           Form 2           Form 3           Form 4           Form 5
                    (N = 68)         (N = 71)         (N = 66)         (N = 65)         (N = 65)
                     Mean     SD      Mean     SD      Mean     SD      Mean     SD      Mean     SD

Total      words  118.015  34.56   110.056  35.73   110.061 [ill.]    [ill.] [ill.]   115.892  45.19
           items   21.294   8.72    19.746   8.90    19.606   9.68    20.524   9.74    19.723   9.99

Cat. I     words   32.353  10.98    37.042  13.47    31.590  12.35    37.631  43.52    29.554  13.76
           items    5.721   2.83     6.732   3.50     5.121   2.98     6.538   3.56     4.769   2.89
      (no. items)      14               16               16               17               15

Cat. II    words   41.485  11.85    25.521   7.47    34.167  12.77    34.461  10.39    30.169  12.01
           items    7.441   3.30     4.549   1.98     5.924   3.29     6.446   3.08     5.185   3.03
      (no. items)      16               12               16               14               13

Cat. III   words   16.691   6.02    14.606   6.64    22.197  10.28    20.092  10.01    18.108   7.48
           items    3.059   1.54     2.521   1.65     4.258   2.52     3.585   2.41     3.123   1.94
      (no. items)       8                8               10               10                8

Cat. IV    words   27.750  10.17    33.141  11.77    21.803   7.34    21.831   6.99    35.600  13.99
           items    5.324   2.53     5.958   3.06     4.409   2.14     3.954   1.95     6.646   3.39
      (no. items)      12               14                8                9               14


TABLE 2

Mean Proportion of Words and Items Correct

                   Form 1   Form 2   Form 3   Form 4   Form 5  Average

Total      words    .590     .550     .550     .570     .579     .569
           items    .425     .394     .392     .410     .394     .403

Cat. I     words    .578     .579     .494     .553     .492     .538
           items    .409     .421     .320     .385     .318     .369

Cat. II    words    .648     .532     .534     .615     .580     .581
           items    .465     .379     .370     .461     .398     .414

Cat. III   words    .522     .456     .555     .502     .566     .519
           items    .382     .315     .426     .358     .390     .374

Cat. IV    words    .578     .592     .681     .606     .636     .618
           items    .443     .426     .551     .439     .475     .466
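
The proportions in Table 2 appear to be the Table 1 means divided by the
maximum possible score for the form or category. A short Python check under
that assumption, using Form 1 values (the function name is illustrative):

```python
def mean_proportion(mean_score, n_items, words_per_item=1):
    """Mean score as a proportion of the maximum possible score.

    Use words_per_item=4 for word scores (four words per item) and the
    default of 1 for item scores.
    """
    return mean_score / (n_items * words_per_item)

# Form 1, total words: mean 118.015 out of 50 items x 4 words = 200
total_words = round(mean_proportion(118.015, 50, words_per_item=4), 3)
# Form 1, Category I (14 items): word mean 32.353, item mean 5.721
cat1_words = round(mean_proportion(32.353, 14, words_per_item=4), 3)
cat1_items = round(mean_proportion(5.721, 14), 3)
```

These come out at .590, .578 and .409, matching the Form 1 entries reported
for total words and for Category I words and items.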

Although the five experimental spelling forms were distributed at
random among the 335 high school seniors tested, such randomization could
not be expected to assure that the five resulting groups were matched on any
relevant aptitudes or other attributes. Table 3 reports mean scores earned
by the five groups on four of the verbal tests of the WPC battery. Scores
are in a standard score system for which the statewide high school senior
mean is 50 and the standard deviation 10. Table 3 suggests that the group
assigned experimental Form 3 was less verbally proficient and the group
completing Form 4 more skilled than the remaining Ss.

Correlations among total word scores, total item scores, and word and
item scores for each category within each form are shown in Tables 4-8.
Correlations between item and word scores are indicated by parentheses. The
number of test items represented by each category is indicated in parentheses
preceding the heading "items" for each category. Correlations between total
scores and the several category scores were influenced, of course, by the
number of items (or words) that category contributed to the total score.
Thus, Category III, with relatively fewer items, tended to correlate with the
total scores at a lower level than the other categories.

TABLE 3

WPC Verbal Test Means and Standard Deviations
for Experimental Spelling Test Groups

                                                        Subjects
                          Form 1           Form 2           Form 3           Form 4           Form 5
                         (N = 68)         (N = 71)         (N = 66)         (N = 65)         (N = 65)
WPC test                  Mean   S.D.      Mean   S.D.      Mean   S.D.      Mean   S.D.      Mean   S.D.

Reading Comprehension   46.500  11.09    47.747  12.06    44.894  10.46    49.892  11.20    46.385  11.14
Vocabulary              48.132  10.55    47.303  10.31    46.682  10.26    49.738  10.28    47.231  11.34
English Usage           45.294   8.21    45.028   8.72    44.258   8.33    47.015   7.68    44.957   3.58
Spelling                49.029   9.42    49.239   9.73    46.636   9.09    49.738   8.78    47.554   9.79




TABLE 4

Correlations between Total Scores and Category Scores (Form 1)

                           Category I     Category II     Category III    Category IV
                         words   items   words   items   words   items   words   items

Total           words     .904   (.795)   .935   (.898)   .690   (.626)   .920   (.836)
                items    (.859)   .821   (.826)   .902   (.688)   .722   (.821)   .842

Cat. I          words      --    (.921)   .802   (.787)   .534   (.543)   .765   (.703)
          (14)  items                    (.802)   .710   (.534)   .523   (.649)   .632

Cat. II         words                      --    (.902)   .539   (.511)   .842   (.743)
          (16)  items                                    (.502)   .599   (.812)   .791

Cat. III        words                                      --    (.850)   .546   (.536)
          (8)   items                                                    (.476)   .507

Cat. IV         words                                                      --    (.899)
          (12)  items



TABLE 5

Correlations between Total Scores and Category Scores (Form 2)

                           Category I     Category II     Category III    Category IV
                         words   items   words   items   words   items   words   items

Total           words     .932   (.889)   .845   (.700)   .843   (.691)   .911   (.903)
                items    (.882)   .941   (.835)   .786   (.819)   .759   (.818)   .907

Cat. I          words      --    (.925)   .836   (.578)   .722   (.598)   .801   (.806)
          (16)  items                    (.732)   .668   (.709)   .638   (.739)   .810

Cat. II         words                      --    (.886)   .684   (.578)   .672   (.703)
          (12)  items                                    (.586)   .519   (.511)   .588

Cat. III        words                                      --    (.842)   .699   (.731)
          (8)   items                                                    (.520)   .600

Cat. IV         words                                                      --    (.912)
          (14)  items



TABLE 6

Correlations between Total Scores and Category Scores (Form 3)

                           Category I     Category II     Category III    Category IV
                         words   items   words   items   words   items   words   items

Total           words     .867   (.814)   .920   (.846)   .904   (.867)   .890   (.762)
                items    (.756)   .753   (.882)   .896   (.854)   .887   (.780)   .842

Cat. I          words      --    (.893)   .708   (.627)   .695   (.648)   .689   (.592)
          (16)  items                    (.681)   .672   (.636)   .647   (.729)   .488

Cat. II         words                      --    (.908)   .793   (.780)   .795   (.715)
          (16)  items                                    (.753)   .814   (.729)   .698

Cat. III        words                                      --    (.944)   .795   (.715)
          (10)  items                                                    (.772)   .698

Cat. IV         words                                                      --    (.816)
          (8)   items



TABLE 7

Correlations between Total Scores and Category Scores (Form 4)

                           Category I     Category II     Category III    Category IV
                         words   items   words   items   words   items   words   items

Total           words     .930   (.850)   .910   (.900)   .989   (.826)   .874   (.777)
                items    (.882)   .901   (.872)   .925   (.862)   .845   (.846)   .824

Cat. I          words      --    (.904)   .765   (.774)   .770   (.697)   .774   (.670)
          (17)  items                    (.708)   .754   (.706)   .689   (.695)   .645

Cat. II         words                      --    (.916)   .771   (.712)   .764   (.698)
          (14)  items                                    (.787)   .723   (.794)   .757

Cat. III        words                                      --    (.926)   .709   (.636)
          (10)  items                                                    (.656)   .595

Cat. IV         words                                                      --    (.886)
          (9)   items


TABLE 8

Correlations between Total Scores and Category Scores (Form 5)

                           Category I     Category II     Category III    Category IV
                         words   items   words   items   words   items   words   items

Total           words     .894   (.778)   .888   (.772)   .824   (.713)   .914   (.831)
                items    (.900)   .894   (.892)   .877   (.868)   .872   (.845)   .887

Cat. I          words      --    (.906)   .828   (.716)   .817   (.756)   .851   (.744)
          (15)  items                    (.729)   .682   (.800)   .783   (.708)   .910

Cat. II         words                      --    (.920)   .779   (.710)   .799   (.765)
          (13)  items                                    (.720)   .721   (.651)   .682

Cat. III        words                                      --    (.869)   .759   (.723)
          (8)   items                                                    (.653)   .672

Cat. IV         words                                                      --    (.919)
          (14)  items

Between category correlations showed no clear pattern. At the word level,
correlations tended to range between .7 and .8. The one exception was provided
by Category III on Form 1. This 8 item category correlated only in the .50's
with the other categories on this form. At the item level there is even
greater variability but some evidence that Categories I, II and IV correlated
more strongly with each other than did these categories with Category III.

Tables 9-12 show correlations of experimental spelling scores with each
of the four WPC verbal tests. Correlations of the experimental spelling tests
with the WPC spelling measure make up Table 9. Total item scores on the five
experimental forms correlated from .6 for Form 3 to above .8 for Forms 2 and 5
with the WPC measure, a fifty item, five choice test. Each WPC item consisted
of four correctly spelled words and one incorrectly spelled word, with the Ss'
task to identify the misspelled word. The WPC spelling score, prior to
standardization, was obtained by subtracting one-fourth of the incorrectly
answered items from the number answered correctly. This is, of course, the
common correction for guessing. Estimated reliability for the WPC test, based
on odd-even correlation, is .85 (WPC, 1968).
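
The correction for guessing described above is the usual "rights minus a
fraction of wrongs" rule. A minimal Python sketch, with an illustrative
function name and made-up counts:

```python
def corrected_score(n_right, n_wrong, n_choices=5):
    """Rights minus wrongs / (choices - 1); omitted items are not penalised.

    For a five-choice test this subtracts one-fourth of the incorrectly
    answered items from the number answered correctly.
    """
    return n_right - n_wrong / (n_choices - 1)

# Hypothetical subject on a fifty-item test: 40 right, 8 wrong, 2 omitted.
raw = corrected_score(40, 8)   # 40 - 8/4 = 38.0
```

The divisor is one less than the number of choices because a subject
guessing blindly among five options is expected to be wrong four times for
every lucky hit.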

The two experimental forms, 2 and 5, most highly correlated with tbe WPC 
test were marked by having fewer Category II (letter omission) and more 
Category IV (letter substitution) items than the remaining experimental tests. 
Possibly more responsible for the higher total correlations, however, were 
the correlations of Category I (letter addition) items with the WPC scores. 

Table 10 presents correlations of the spelling tests with the WPC 
vocabulary test. This latter test consists of 100 five-choice antonym items 
with scores again adjusted for guessing. Reliability has been estimated 






TABLE 9 

Experimental Spelling Test Correlations with 
WPC Spelling Test 
(Decimal points omitted) 

                     Form 1   Form 2   Form 3   Form 4   Form 5 

Total      words      704      844      628      665      736 
           items      665      850      605      673      843 
Cat. I:    words      671      803      553      588      783 
           items      611      818      615      600      837 
Cat. II:   words      624      737      580      615      725 
           items      617      632      636      598      732 
Cat. III:  words      512      749      558      649      752 
           items      448      682      560      616      756 
Cat. IV:   words      642      716      551      548      632 
           items      637      756      428      542      671 


TABLE 10 

Experimental Spelling Test Correlations with 
WPC Vocabulary Test 
(Decimal points omitted) 

                     Form 1   Form 2   Form 3   Form 4   Form 5 

Total      words      618      549      586      550      633 
           items      570      536      524      543      698 
Cat. I:    words      550      544      533      583      648 
           items      495      545      518      539      651 
Cat. II:   words      552      390      507      375      644 
           items      544      369      484      411      603 
Cat. III:  words      488      496      546      551      597 
           items      406      445      522      509      528 
Cat. IV:   words      577      499      534      452      643 
           items      572      470      409      466      649 
WPC Spelling          607      563      518      490      703 

TABLE 11 

Experimental Spelling Test Correlations with 
WPC English Usage Test 
(Decimal points omitted) 

                     Form 1   Form 2   Form 3   Form 4   Form 5 

Total      words      575      667      558      654      665 
           items      553      667      506      654      764 
Cat. I:    words      568      659      525      651      728 
           items      505      648      505      586      745 
Cat. II:   words      473      521      515      507      689 
           items      479      493      496      558      668 
Cat. III:  words      473      507      490      657      646 
           items      398      460      476      609      657 
Cat. IV:   words      521      616      468      523      604 
           items      554      617      369      560      631 
WPC Spelling          635      714      685      629      776 






TABLE 12 

Experimental Spelling Test Correlations with 
WPC Reading Comprehension Test 
(Decimal points omitted) 

                     Form 1   Form 2   Form 3   Form 4   Form 5 

Total      words      605      595      562      478      518 
           items      531      586      510      503      606 
Cat. I:    words      547      626      466      516      556 
           items      455      613      445      481      632 
Cat. II:   words      553      505      548      317      486 
           items      512      472      523      388      443 
Cat. III:  words      449      414      524      427      558 
           items      374      370      503      422      527 
Cat. IV:   words      560      523      484      471      520 
           items      546      493      375      484      539 
WPC Spelling          523      612      556      520      643 



at .95 (WPC, 1968). Total item scores for the experimental tests correlated 
with vocabulary at essentially the same level as the WPC spelling test did with 
the vocabulary measure. There was, incidentally, considerable variability 
in these correlations between WPC measures. Estimates of the 
vocabulary-spelling correlation ranged from below .5 for the group taking 
experimental Form 4 to above .7 for the group taking experimental Form 5. 

This confounding of groups with forms was further illustrated by the 
correlations of the experimental spelling forms with the remaining two WPC 
verbal tests, English usage and reading comprehension, presented in Tables 
11 and 12. Total item scores for experimental Forms 2 and 5 were again more 
highly correlated with the WPC measures than were those for the other forms. 
At the same time, however, it is clear that English usage and reading 
comprehension were also more highly correlated with the WPC spelling test for 
the groups taking those two experimental forms. 

Discussion 

Through computer programming, the present study developed a battery of 
machine-scorable spelling tests which required testees to respond to each 
of the randomly selected words in the randomly ordered items by a forced-choice 
(correct vs. incorrect) method. This forced-choice method required 
that Ss respond to each of four words in each of 50 items. The comparable 
50-item WPC spelling section, which uses a multiple-choice answering method, 
required that Ss respond to only the one "wrong" word in each item. Initial 
testing with the experimental battery demonstrated that the 
average testee needed at least fifteen minutes to complete the fifty items, 
while most testees required only ten minutes to complete the WPC spelling 
section. This greater length of time necessary for the experimental forms 
indicated that Ss had to respond to a greater number of critical words in the 
experimental forms. 
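The forced-choice format can be pictured with a small scoring sketch. This passage does not spell out the scoring rules, so the definitions here are assumptions: the word score counts every word judged correctly, and the item score counts items in which all four words were judged correctly. The function name and the two sample items are hypothetical.

```python
def score_form(key, responses):
    """Score forced-choice responses at the word level and the item level.

    key and responses are lists of items; each item is a tuple of booleans,
    one per word (True = "spelled correctly", False = "misspelled").
    Assumed rules: one word point per correct judgment, one item point
    only when every word in the item is judged correctly.
    """
    if len(key) != len(responses):
        raise ValueError("key and responses must cover the same items")
    word_score = 0
    item_score = 0
    for key_item, resp_item in zip(key, responses):
        if len(key_item) != len(resp_item):
            raise ValueError("each item needs one response per word")
        hits = sum(k == r for k, r in zip(key_item, resp_item))
        word_score += hits
        if hits == len(key_item):
            item_score += 1
    return word_score, item_score


# Two hypothetical four-word items; the second contains one wrong judgment.
key = [(True, False, True, True), (False, False, True, True)]
responses = [(True, False, True, True), (False, True, True, True)]
print(score_form(key, responses))  # (7, 1)
```

On a full 50-item form this format calls for 200 separate judgments, against 50 choices on the WPC section, which is in line with the longer working time reported above.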

Each of the fifty items represented a particular error category and one 
set of one or two letters involved in the error. Because categories were 
represented by a variable number of items, correlations involving the 
categories were difficult to interpret. Perhaps equal representation of each 
category would better indicate the predictive value of each category for 
total scores. 

High correlations among the error categories for adding, subtracting, 
and substituting letters erroneously suggest that these errors involve similar 
processes. However, low correlations between Category III and the other 
categories suggest that the process of inverting the order of letters may be 
dissimilar to other error processes. Low mean scores on Category III across 
forms indicate that this type of error may also be more difficult to discriminate. 
Possibly the S recognizes that the correct letters are present in the word, but 
he fails to recognize that these letters appear in the wrong order. Consistently 
higher mean scores on Category IV suggest that substitution of wrong for 
correct letters is the form of error which is easiest to discriminate. 

On balance, the computer-constructed forms functioned essentially equivalently 
to the WPC spelling test, suggesting that this mode of test construction is 
worth further study. Separate error categories did not function as 
independently as was hoped. Greater care in balancing the content of tests as 
they are constructed and some experimentation with the definition of 
"super categories" could produce superior results. 